RIDDLE: Race and ethnicity Imputation from Disease history with Deep LEarning
Authors
Abstract
Anonymized electronic medical records are an increasingly popular source of research data. However, these datasets often lack race and ethnicity information. This creates problems for researchers modeling human disease, because race and ethnicity are powerful confounders for many health exposures and treatment outcomes, and they are closely linked to population-specific genetic variation. We showed that deep neural networks generate more accurate estimates for missing racial and ethnic information than competing methods (e.g., logistic regression, random forest). RIDDLE yielded significantly better classification performance across all metrics that were considered: accuracy, cross-entropy loss (error), and area under the curve for receiver operating characteristic plots (all p < 10^-6). We made specific efforts to interpret the trained neural network models to identify, quantify, and visualize medical features that are predictive of race and ethnicity. We used these characterizations of informative features to perform a systematic comparison of differential disease patterns by race and ethnicity. The fact that clinical histories are informative for imputing race and ethnicity could reflect (1) a skewed distribution of blue- and white-collar professions across racial and ethnic groups, (2) uneven accessibility and subjective importance of prophylactic health, (3) possible variation in lifestyle, such as dietary habits, and (4) differences in background genetic variation that predispose to diseases.
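The three metrics named in the abstract (accuracy, cross-entropy loss, and area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy labels and probabilities, and the choice of macro-averaged one-vs-rest AUC for the multiclass case are all assumptions made for illustration.

```python
import math

def accuracy(y_true, probs):
    """Fraction of samples whose highest-probability class equals the label."""
    hits = sum(
        1 for y, p in zip(y_true, probs)
        if max(range(len(p)), key=p.__getitem__) == y
    )
    return hits / len(y_true)

def cross_entropy(y_true, probs, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = sum(math.log(max(p[y], eps)) for y, p in zip(y_true, probs))
    return -total / len(y_true)

def binary_auc(labels, scores):
    """AUC as the chance a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(y_true, probs):
    """Unweighted mean of one-vs-rest AUCs (an assumed averaging scheme)."""
    n_classes = len(probs[0])
    aucs = []
    for k in range(n_classes):
        labels = [1 if y == k else 0 for y in y_true]
        scores = [p[k] for p in probs]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / n_classes

# Hypothetical toy data: 4 samples, 3 classes, rows are class probabilities.
y_true = [0, 1, 2, 1]
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
    [0.5, 0.4, 0.1],
]

print("accuracy:", accuracy(y_true, probs))
print("cross-entropy:", cross_entropy(y_true, probs))
print("macro one-vs-rest AUC:", macro_ovr_auc(y_true, probs))
```

Higher accuracy and AUC and lower cross-entropy indicate better classification; comparing methods on all three, as the abstract describes, guards against a model that ranks classes well but assigns poorly calibrated probabilities.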
Similar resources
Race and Ethnic Differences in the Associations between Cardiovascular Diseases, Anxiety, and Depression in the United States
Introduction: Although cardiovascular diseases and psychiatric disorders are linked, it is not yet known if such links are independent of comorbid medical diseases and if these associations depend on race and ethnicity. This study aimed to determine if the associations of cardiovascular diseases with general anxiety disorder (GAD) and major depressive episode (MDE) are ind...
Using Bayesian Imputation to Assess Racial and Ethnic Disparities in Pediatric Performance Measures.
OBJECTIVE: To analyze health care disparities in pediatric quality of care measures and determine the impact of data imputation. DATA SOURCES: Five HEDIS measures are calculated based on 2012 administrative data for 145,652 children in two public insurance programs in Florida. METHODS: The Bayesian Improved Surname and Geocoding (BISG) imputation method is used to impute missing race and ethni...
Combined Effects of Race and Educational Attainment on Physician Visits Over 24 Years in a National Sample of Middle-Aged and Older Americans
Background: The literature on Minorities’ Diminished Returns (MDRs) has shown worse than expected health among members of racial and ethnic minority groups, particularly Blacks. Theoretically, this effect can be in part due to weaker effects of educational attainment on preventive care and disease management in highly educated racial and ethnic minorities. Object...
Substance Use and the Number of Male Sexual Partners by African American and Puerto Rican Women
Background: In the United States (US), there are 19 million new sexually transmitted disease (STD) infections each year. Untreated STDs can lead to serious long-term adverse health consequences, especially for young women. The Centers for Disease Control and Prevention estimates that undiagnosed and untreated STDs cause at least 24,000 women in the US each year to become infertile. This clearly is a...
More Accurate Racial and Ethnic Codes for Medicare Administrative Data
Analyses of health care disparities in Medicare using administrative race and ethnicity data have typically been limited to Black and White beneficiaries. This is in part due to the small size of the other categories, inaccuracies in the race and ethnicity codes, and caveats that more extensive analyses would produce biased results. While previous Medicare efforts certainly improved the accurac...
Journal: CoRR
Volume: abs/1707.01623
Publication date: 2017